Abstract The Sixth Data Release (DR6) of the Sloan Digital Sky Survey (SDSS) provides
more photometric regions, new features and more accurate data around the globular cluster
Palomar 5. A new method, the Back Propagation Neural Network (BPNN), is used to estimate the
cluster membership probability of each star in order to detect the tidal tails of the cluster.
Cluster and field stars, used for training the networks, are extracted over a 40 x 20 deg^2
field using color-magnitude diagrams (CMDs). The best BPNNs, with two hidden layers and the
Levenberg-Marquardt (LM) training algorithm, are determined from the chosen cluster and field
samples. The membership probabilities of stars in the whole field are obtained with the BPNNs,
and contour maps of the probability distribution show that one tail extends 5.42° to the north
of the cluster and another extends 3.77° to the south. The tails as a whole are similar to those
detected by Odenkirchen et al. (2003), but no longer debris of the cluster is found to the
northeast. The radial density profiles are investigated both along the tails and near the
cluster center. Quite a few substructures are discovered in the tails. The number density
profile of the cluster is fitted with the King model and the tidal radius is determined as
14.28'. However, the King model cannot fit the observed profile in the outer regions (R > 8')
because of the tidal tails generated by the tidal force. Luminosity functions of the cluster
and the tidal tails are calculated, which confirm that the tails originate from Palomar 5.

Key words: methods: statistical — Galaxy: halo — Galaxy: structure — globular cluster: 
individual (Palomar 5) 

1 INTRODUCTION 

Globular clusters (GCs) are the oldest populations in the Galaxy. Most GCs, which formed in the early
days of the Galaxy, have been destroyed by various mechanisms (Wu et al. 2003). Mass loss from stellar
evolution is very important during the first ~ 1 Gyr of cluster evolution, and most low-mass clusters
dissolved during this early phase. For surviving clusters, the evolution is dominated by the
internal dynamical processes caused by encounters between cluster stars (two-body relaxation) (Spitzer
1987). GCs in the Galaxy have elliptical orbits, and some of them can move into the central region of the
Galaxy with perigalactic distances less than ~ 1 kpc (Wu et al. 2004). When a cluster crosses the bulge or
disk of the Galaxy on a timescale shorter than its internal dynamical time, the cluster stars gain energy
and evaporation speeds up. Such an interaction is referred to as a tidal shock (Spitzer 1987). Stars
evaporating from the cluster due to two-body relaxation or tidal shocks do not leave the cluster and merge
into the Galactic field immediately. They move along the same orbit as the cluster and form the 'tidal
tail' of the cluster.

Grillmair et al. (1995) examined the outer structures of 12 Galactic globular clusters using star-count
analysis with deep, two-color photographic photometry. They found that most of their sample clusters show
extra-tidal wings in their surface density profiles. Two-dimensional surface density maps for several clusters
show the expected appearance of tidal tails. Leon et al. (2000) used large-field photographic photometry
of 20 globular clusters to investigate the presence of tidal tails around these clusters. In that study, star-count
analysis and the wavelet transform were used to detect the weak structures formed by stars that were previously
members of the clusters. Most of the globular clusters in their sample display large, extended tidal
tails, which exhibit projected directions towards the Galactic center.

The studies of Grillmair et al. (1995) and Leon et al. (2000) are both based on photographic observations
covering large areas around the clusters. The low signal-to-noise ratio of the photographic photometry and serious
contamination from background galaxies make the detected tidal tails uncertain in some clusters. The
SDSS provides large, deep CCD imaging in five passbands covering 10,000 deg^2 of the sky. The SDSS
can also separate stars from galaxies very well and is very efficient in detecting tidal tails around globular
clusters in the Galaxy. Using SDSS data, well-defined tidal tails have been identified in some globular clusters:
Palomar 5 (Odenkirchen et al. 2001, 2003; Grillmair & Dionatos 2006), NGC 5466 (Belokurov et al.
2006; Grillmair & Johnson 2006), and NGC 5053 (Lauchner et al. 2006).

Among the studies mentioned above, Palomar 5 is a prominent object. It is a remote globular cluster located at
a distance of 23.2 kpc from the Sun and has a tidal radius of about 16.3 arcmin (Harris 1996). Using
ESO and SERC survey plates covering about 2.0° x 2.0° in the R and J filters, Leon et al. (2000) searched for
tidal tails around this cluster and found that the detected structures outside the cluster are strongly biased
by the background galaxy clusters appearing in the field, so that it is difficult to draw any conclusion about the
genuine location of stars in the tidal tails of this cluster.

Using SDSS data centered on Palomar 5 in a region with right ascension 226° < α < 232°
and declination -1.25° < δ < +1.25°, Odenkirchen et al. (2001) searched for the tidal tails of this cluster
based on the empirical photometric filtering method of Grillmair et al. (1995). They found two well-defined
tidal tails emerging from the cluster, stretching out symmetrically to both sides and extending over an
angle of 2.6° on the sky. Using new SDSS data (before the public data release DR1) yielding complete
coverage of a 6.5° to 8° wide zone along the equator with right ascension from 224° to
236°, and based on an optimal contrast filtering method, Odenkirchen et al. (2003) found that the tidal tails of
Palomar 5 have a much larger spatial extent and can be traced over an arc of about 10° on the sky. More recently,
using the SDSS DR4 data, Grillmair & Dionatos (2006) applied the optimal contrast filtering method of
Odenkirchen et al. (2003) to trace the tidal tails of Palomar 5 in the region 224° < α < 247° and
-3° < δ < +10°, and found the tidal tails to extend some 22° on the sky.

Most of these studies are based on star-count analysis in color-magnitude space. Belokurov et al.
(2006), in contrast, offered a new way to recognize the tidal tails of clusters. Their method is based on
an intelligent computing technique, Artificial Neural Networks (ANNs), which has been applied in many
areas, such as classification and pattern recognition. In the study of Belokurov et al. (2006), a back
propagation neural network, the most widely applied ANN, was used to estimate the cluster membership
probability of each object in the SDSS 5-band data space. Compared with the method of Odenkirchen et al. (2003),
the BPNN makes full use of the photometric information, not just one CMD. Therefore, in this paper, we
apply this method to investigate the tidal tails of Palomar 5, using the DR6 data in a larger region
(40 x 20 deg^2).

In §2 we describe the details of the SDSS DR6 and the preprocessing of the observed data. Section 3
presents the general idea of the BPNN. In §4 we construct the BPNNs with the best performance after
training them with properly selected training data, and then apply them to the detection of the tidal tails
of Palomar 5. Section 5 discusses the profiles and features of the tails. A brief conclusion is given in §6.



2 THE STAR SAMPLE 
2.1 Observations 

In this study, the photometric data in the SDSS DR6 are used. The SDSS is a photometric and spectroscopic
survey, providing detailed optical images covering more than a quarter of the sky and a 3-dimensional map
of about a million galaxies and quasars. A dedicated 2.5-meter telescope at Apache Point, New
Mexico, is equipped with a 120-megapixel camera and a pair of spectrographs fed by optical fibers measuring
more than 600 sources in a single observation. There are 30 photometric CCDs, each with 2048 x 2048 pixels.
The field of view is 3.0°, and 5 broad-band filters covering wavelengths from 3000 Å to
10000 Å are used when photometric images are taken. So far, successive data releases have been published,
including the Early Data Release (EDR), DR1, DR2, DR3, DR4, DR5, DR6 and DR7.

DR6 (Adelman-McCarthy et al. 2008) is the first release with significant changes to the processing
software since DR2. For example, the calibration is improved by using cross-scans to tie the photometry
of the entire survey together. The photometric calibration now has uncertainties of roughly
1% in g, r, i and z, and 2% in u, which are substantially better than those of previous data releases. In
addition, the magnitude limits are 22.0 for u, 22.2 for g and r, 21.3 for i, and 20.5 for z. More importantly,
compared with DR4, DR6 includes newly observed regions where we can search for the tidal tails of Palomar 5.

Odenkirchen et al. (2003) only considered a limited region covering an area of 87 deg^2 and found the
tails to extend over an angular distance of about 10°. No further investigation was made in the north, where
photometric data exist, to see whether the tails are longer. Grillmair & Dionatos (2006) detected 22° tidal
tails in a larger region. Far from the center of Palomar 5, these newly discovered tails do not appear clearly,
since the signal and the background noise are so similar. Moreover, Grillmair & Dionatos (2006) used
DR4, in which a narrow strip (α > 228.5° and 0.5° < δ < 1.5°) stretching along the northern tidal tail has no
photometric data, while DR6 fills in this area. Therefore, in an area of 40 x 20 deg^2 (220° < α < 260°
and -5° < δ < 15°), DR6 provides more photometric data. Because of the tremendous number of objects
in the area, we partition the whole field into 4 regions: R1 (R.A.: 220° to 230°, Dec: -5° to 15°);
R2 (R.A.: 229° to 240°, Dec: -5° to 15°); R3 (R.A.: 239° to 250°, Dec: -5° to 15°) and R4 (R.A.:
249° to 260°, Dec: -5° to 15°), where any two contiguous regions overlap by an area of 1 x 20
deg^2 in order to avoid bad smoothing at the edges (see Fig. 1).

The information we need to detect the tidal tails of Palomar 5 includes: the coordinates (J2000), the
point spread function (PSF) magnitude (r_psf) and the exponential model magnitude (r_exp) in the r band, the
reddening values and the type of each source. The method of separating stars from galaxies, the definitions of
r_psf and r_exp, the photometric and astrometric data reduction, and related information are described in
Lupton et al. (2001), Stoughton et al. (2002) and Abazajian et al. (2004). We use the so-called Cmodel magnitudes
(the default values provided for ugriz) as our photometric data, because the Cmodel magnitude is the best fit
of the exponential and de Vaucouleurs models in each band, and it agrees excellently with both the PSF magnitude
of stars and the Petrosian magnitude of galaxies. Even though we only extract stars to trace the tidal tails, we
cannot guarantee that the sample is free of galaxies that the SDSS data reduction pipelines failed to
distinguish. Thus, for uniformity and validity of the photometry of both stars and
galaxies, we choose the universal magnitudes (Cmodel). Reddening corrections for each object are derived
from the reddening values of Schlegel et al. (1998).

In our selected sky field, we obtain 15,305,060 sources in the catalog, of which 7,410,896 are stars
and 7,894,164 are galaxies as classified by the SDSS pipelines. There are 5,458,077 sources in R1, 5,508,433 in R2,
4,164,575 in R3 and 173,975 in R4.

2.2 Data Preprocessing 

Source type determination in the SDSS has some flaws. For example, the SDSS pipelines occasionally
fail to distinguish blends and pairs of stars with small separations. Sometimes the classification scheme
regards Seyfert galaxies or QSOs as stars (Stoughton et al. 2002), and overflows of very bright stars are
identified as galaxies. Furthermore, due to variations in observing conditions and natural differences
between fields, the completeness of object detection fluctuates. Considering the magnitude limits and the
situations described above, it is necessary to apply magnitude cutoffs to avoid unnecessary contamination. We
therefore select sample stars in the magnitude range 14 < r < 22. Fig. 1 gives a visual impression of the
selected star sample and the photometric boundaries. In Fig. 1, only 1/30 of the sample stars, randomly selected,
are drawn so that the plot is not saturated. M5 and Palomar 5 are also indicated in Fig. 1. The subplot in
this figure is an enlargement of the part including M5 and Palomar 5. The galactocentric distance of M5 is
about 6.2 kpc, far from that of Palomar 5, which is about 18.6 kpc. So M5 hardly has any
effect on Palomar 5.
Fig. 1: Distribution of all point sources in the magnitude range 14 < r < 22 in the 40 x 20 deg^2 area. This
area is divided into four overlapping regions (R1, R2, R3 and R4), which are processed in batches because of
the very large number of objects. M5 (center: α = 229.6°, δ = 2.1°) and Palomar 5 (center: α = 229.02°,
δ = -0.11°) are two regions of concentrated stars. The smaller box (228.5° < α < 230.5°, -0.5° < δ < 2.5°)
encloses these two objects and the bigger one (subplot) shows a magnified image of the smaller one. The
blank areas are regions that may contain very dense clusters or bright stars, or regions the SDSS has not
covered.

Fig. 2 presents the interstellar extinction distribution derived from the values of E(B - V) from
Schlegel et al. (1998). The resolution of the distribution is about 6 arcmin. On the whole, the extinction is
smaller in the northwest of the field and larger in the southeast. The reddening
E(B - V) is 0.057 on average, with a maximum of about 0.384 and a minimum of about 0.016.
All the sources in the sample are dereddened.

Following Belokurov et al. (2006), the distributions of the difference between the magnitudes obtained by
PSF photometry and by fitting an exponential profile, namely r_psf - r_exp, are plotted for both stars and
galaxies in Fig. 3. It is clear that stars are tightly concentrated around zero, whereas galaxies show a significant
positive excess. Apart from the cases mentioned above, the classification is unreliable when
galaxies are point-like. The PSF is the best-fit model for estimating the magnitude of a point source (r_psf), while
the exponential model is one of the models used to calculate the magnitude of an extended one (r_exp). The SDSS
constructs a simple classifier from the analogous difference between r_psf and r_exp. In Fig. 3, r_psf - r_exp is
concentrated around zero for point-like sources, while it lies far from zero for extended sources. The two
distributions overlap in a range where stars and galaxies cannot be distinguished. Also following Belokurov et al.
(2006), we take the threshold r_psf - r_exp = 0.05 as the division between stars and galaxies. As a result, a sample of
4,082,662 point sources remains, which includes 1,079,301 in R1, 1,478,788 in R2, 1,378,560 in R3 and
146,013 in R4.
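The two cuts described in this section (the magnitude range and the r_psf - r_exp threshold) can be sketched as follows. This is a minimal illustration, not the pipeline used in the paper: the function name, the array names and the toy values are hypothetical, and the threshold is assumed to be one-sided (stars below 0.05), as suggested by the positive excess of galaxies in Fig. 3.

```python
import numpy as np

def select_point_sources(r, r_psf, r_exp, r_min=14.0, r_max=22.0, max_diff=0.05):
    """Boolean mask of sources kept as stars (hypothetical helper).

    A source is kept when r_min < r < r_max and r_psf - r_exp < max_diff.
    The one-sided form of the second cut is an assumption.
    """
    r = np.asarray(r, dtype=float)
    diff = np.asarray(r_psf, dtype=float) - np.asarray(r_exp, dtype=float)
    in_range = (r > r_min) & (r < r_max)
    point_like = diff < max_diff
    return in_range & point_like

# Toy catalogue with made-up magnitudes (not SDSS data)
r = np.array([15.2, 19.8, 21.5, 22.4, 13.5])
r_psf = np.array([15.21, 19.95, 21.52, 22.41, 13.50])
r_exp = np.array([15.20, 19.70, 21.50, 22.40, 13.49])
mask = select_point_sources(r, r_psf, r_exp)
```

In this toy catalogue the second source is rejected as extended and the last two fall outside the magnitude range.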

3 BACK PROPAGATION NEURAL NETWORK 



Fig. 2: Distribution of interstellar extinction from the Milky Way. The resolution of this image is 6 arcmin,
and the E(B - V) values used by the SDSS database are derived from Schlegel et al. (1998). As mentioned previously,
the blank regions are M5, bright stars and areas not observed by the SDSS.

The BPNN is now applied in various areas, such as pattern classification, face and speech
recognition, and finance (Hagan, Demuth & Beale 1996; Haykin 1998). In astronomy, the BPNN is used as a
classifier of both photometric and spectral data (von Hippel et al. 1994; Folkes et al. 1996), as a
morphological recognizer in imaging (Odewahn et al. 1992; Naim et al. 1995), or as a value estimator
for determining theoretical models and physical parameters (Bailer-Jones 2000). The mechanism of the
BPNN is as follows. First, a train-test data set is provided and the configured BPNN is trained with it, just as a
teacher teaches a student which is a cluster star and which is a field star. The BPNN then learns
repeatedly, modifying its internal configuration to improve its performance. During this
process, the BPNN is evaluated on the test data using what it has learned. Finally, through repeated
train-test-modify cycles, the classifier (BPNN) acquires the features of the two classes and is ready
to be applied to new data. MATLAB, a data processing tool that provides a dedicated neural network toolbox
for designing and implementing all kinds of ANNs, is used in our work. The definition of neural networks and
other technical terms, as well as the detailed procedure of the BPNN, are described in the appendix.

As mentioned previously, Belokurov et al. (2006) used neural networks to reconstruct the probability
distribution of cluster stars with the SDSS ugriz photometric data. The idea of the approach is very simple:
with the ugriz 5-band photometric data of cluster members and field stars as inputs of a BPNN, we obtain an
estimate of the cluster membership probability as the output once the network is well trained. This method
makes full use of the photometric information and constructs a probability estimator in a high-dimensional data
space with limited resources. During training, the BPNN can pick out bad sources automatically to
form an accurate separator.

With the sample strictly selected, the first step in detecting the tidal tails is to identify all the cluster
members of Palomar 5 in the selected field. Many pattern recognition techniques are available, such as the
Bayes classifier, template matching, clustering analysis and artificial neural networks (see, e.g., Sergios &
Konstantinos 2006, and references therein). In the present study, the only thing we need to do is to estimate
the membership probability of an object. The BPNN can estimate the posterior probability P(C|x) in a
high-dimensional space, where C denotes the cluster member class and x is the photometric data vector.
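The reason a network trained on 0/1 targets with a squared-error criterion can be read as a posterior estimator may be sketched as follows. This is a standard textbook argument, not a derivation taken from the paper; the symbols t (target value), y* (ideal network output) and F (field-star class) are introduced here for illustration only, and the priors P(C) and P(F) are assumed to be those implied by the composition of the training set.

```latex
% t = 1 for cluster stars (class C), t = 0 for field stars (class F)
y^{*}(\mathbf{x})
  = \arg\min_{y}\ \mathbb{E}\!\left[(y - t)^{2} \mid \mathbf{x}\right]
  = \mathbb{E}[t \mid \mathbf{x}]
  = P(C \mid \mathbf{x}),
\qquad
P(C \mid \mathbf{x})
  = \frac{p(\mathbf{x} \mid C)\,P(C)}
         {p(\mathbf{x} \mid C)\,P(C) + p(\mathbf{x} \mid F)\,P(F)} .
```

A real network only approximates y*, so its output should be treated as an estimate of P(C|x) whose absolute scale depends on the ratio of cluster to field stars used in training.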

Fig. 3: The distributions of r_psf - r_exp for both stars and galaxies. Kernel density estimation with a normal
kernel and a bandwidth of 0.02 is used to fit the distributions (Bowman et al. 1997). The two distributions
are normalized to their peak values.

To conclude this section, we summarize the basic parameters and components of the BPNN used in this paper. First,
the dimension of the input layer is 5 (the photometric magnitudes), and the transfer function of each neuron is
the Log-Sigmoid function. Second, the dimension of the output vector is one, with target value 0 (for field stars)
or 1 (for cluster stars). The mean squared error (MSE) is used to measure the deviation between the output of the
BPNN and the true object type value. Finally, the Levenberg-Marquardt Backpropagation (LMBP) algorithm is used
to train the network by minimizing the performance function MSE. The initial state (such as the initial weights
and biases, the termination condition and the parameters of the LMBP algorithm) is set automatically by MATLAB.
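The components summarized above (five inputs, log-sigmoid neurons, a single output and the MSE performance function) can be illustrated with a small NumPy sketch. It is not the MATLAB implementation used in the paper: the hidden-layer sizes, the weight initialization and the toy data are assumptions, and plain full-batch gradient descent is substituted for the Levenberg-Marquardt algorithm to keep the example short.

```python
import numpy as np

def logsig(z):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))

def init_network(sizes, rng):
    """One (weights, biases) pair per layer, e.g. sizes = [5, 10, 10, 1]."""
    return [(rng.normal(0.0, 0.5, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return the activations of every layer, input first, output last."""
    acts = [x]
    for W, b in params:
        acts.append(logsig(acts[-1] @ W + b))
    return acts

def mse(y, t):
    """Mean squared error between network outputs y and targets t."""
    return float(np.mean((y - t) ** 2))

def gradient_step(params, x, t, lr):
    """One full-batch back-propagation update (gradient descent, not LM)."""
    acts = forward(params, x)
    y = acts[-1]
    # Derivative of the MSE through the output log-sigmoid
    delta = 2.0 * (y - t) / len(x) * y * (1.0 - y)
    updated = []
    for i in reversed(range(len(params))):
        W, b = params[i]
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Propagate the error to the previous layer using the old weights
            delta = (delta @ W.T) * acts[i] * (1.0 - acts[i])
        updated.append((W - lr * grad_W, b - lr * grad_b))
    return updated[::-1]

# Toy data: 5 standardized "magnitudes" per object, target 1 or 0 (made up)
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))
t = (x[:, 0] + x[:, 1] > 0).astype(float)[:, None]

params = init_network([5, 10, 10, 1], rng)
mse_before = mse(forward(params, x)[-1], t)
for _ in range(1000):
    params = gradient_step(params, x, t, lr=0.5)
mse_after = mse(forward(params, x)[-1], t)
```

Because the output neuron is a log-sigmoid, the network output always lies between 0 and 1, which is what allows it to be read as a membership probability.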



4 TIDAL TAILS DETECTION BASED ON BPNN 
4.1 Data Set Selection for Training and Test 

In order to apply the BPNN method to the detection of the tidal tails of Palomar 5, a training and test
data set must first be chosen to teach the BPNN the properties of the cluster, so that it is able
to estimate the cluster membership probability. As Belokurov et al. (2006) suggested, we select cluster stars
from candidates using color-magnitude diagrams.

Along the main sequence in the CMD, stars within a suitable radius of the center of a cluster are
likely to be members. Fig. 4 shows the radial number density distribution of stars around Palomar 5. The
density decreases from the center of the cluster outwards. For R > 0.25°, the average
number density becomes about 1.0958 ± 0.0497 x 10^ deg^-2. The radius R = 0.13°, where the density is still a
little higher than this average, is used to select cluster star candidates while avoiding excessive field stars.
In total, 1523 cluster member candidates are retained.
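A radial number density profile of the kind shown in Fig. 4 can be computed by counting stars in concentric annuli, as in the sketch below. The function name, the flat-sky approximation and the toy star field are assumptions made for illustration; only the adopted center of Palomar 5 is taken from the text.

```python
import numpy as np

def radial_density(ra, dec, ra0, dec0, edges):
    """Number density (stars per deg^2) in annuli around (ra0, dec0).

    Uses a flat-sky approximation, which is adequate for radii of about
    a degree or less. Returns the density and its Poisson error per annulus.
    """
    dx = (np.asarray(ra) - ra0) * np.cos(np.radians(dec0))
    dy = np.asarray(dec) - dec0
    radius = np.hypot(dx, dy)
    counts, _ = np.histogram(radius, bins=edges)
    area = np.pi * (edges[1:] ** 2 - edges[:-1] ** 2)
    return counts / area, np.sqrt(counts) / area

# Toy field: 20000 stars spread uniformly over 1 deg^2 around the cluster center
ra0, dec0 = 229.02, -0.11
rng = np.random.default_rng(1)
ra = ra0 + rng.uniform(-0.5, 0.5, 20000) / np.cos(np.radians(dec0))
dec = dec0 + rng.uniform(-0.5, 0.5, 20000)
edges = np.array([0.0, 0.2, 0.4])
density, density_err = radial_density(ra, dec, ra0, dec0, edges)
```

For a uniform field the profile is flat; a cluster would show up as an excess in the innermost annuli, and the radius at which the profile approaches the field level guides the choice of a candidate radius such as R = 0.13°.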

The next step is to pick out the most probable cluster stars from the 1523 candidates. Fig. 5 demonstrates
the process of selecting cluster stars as part of the training and test data set. Objects are retained by
enclosing regions of the CMDs of r vs r - i (Fig. 5a) and g vs g - r (Fig. 5b) with suitable boundaries. These objects
may be main-sequence stars, red giants and blue horizontal branch stars belonging to Palomar 5. In this
way, contamination from residual galaxies and field stars (crosses in both CMDs) is almost eliminated.
Taking the intersection of the selections in these two CMDs, 957 objects are kept, forming the distribution of
cluster stars in Fig. 5d (bigger black dots).
Fig. 4: The radial number density distribution around the center of Palomar 5. The solid line shows that
the density declines as the radius increases. The dashed line gives a threshold R = 0.13°. Stars within the 
threshold radius are considered as cluster member candidates. 

For field-star selection, the criterion R > 1°, which is far enough from the center of Palomar 5, is adopted to
choose field star candidates. Because the number of candidates is tremendous, it is impossible to keep all
of them in our data set. Ideally, roughly equal numbers of objects should lie on both sides of
the class boundary in the data space so that they fully represent the two classes. Accordingly, we choose
field stars randomly on the sky such that the numbers of field and cluster stars near the
main-sequence turnoff in the CMD are roughly equal. For this purpose, a box (the white rectangle with 4 hollow
circles at the vertices in Fig. 5d) with 19.5 < r < 21 and 0.05 < r - i < 0.18 is defined to enclose the turnoff.
There are about 320 stars in this box of the CMD for the cluster and for the field, respectively. As a result, 6991
field stars (smaller dots in Fig. 5c) and 957 cluster stars constitute the whole train-test data set (Fig. 5d). As
mentioned previously, the field stars, with target value 0, and the cluster stars, with target value 1, are used to
train a BPNN. The BPNN configures itself to estimate the cluster membership probability of each object.
Although the resulting data set may contain some impurities, they are negligible relative to the total
group sizes. Moreover, the BPNN suppresses them automatically through repeated training,
which is one of the reasons why we adopt the BPNN as our method.

4.2 Network Structure Determination 

In order to normalize the input data, we subtract the mean magnitudes of all data and then divide by
their standard deviations. Furthermore, to determine the number of layers and the number of neurons in each
layer, the star sample is split into training and test data sets. The training set is used to train a BPNN,
while the test set is used to measure its performance. Since cluster stars are relatively scarce, all of them are
placed in the training set. Half of the field stars are randomly assigned to the training set and the rest
form the test set.
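The normalization and the train-test split described above can be sketched as follows. The helper names and the synthetic magnitudes are hypothetical, and the mean and standard deviation are assumed here to be computed over the combined cluster and field sample; only the sample sizes (957 cluster stars and 6991 field stars) come from the text.

```python
import numpy as np

def standardize(mags):
    """Subtract the per-band mean and divide by the per-band standard deviation."""
    mean = mags.mean(axis=0)
    std = mags.std(axis=0)
    return (mags - mean) / std, mean, std

def split_train_test(cluster, field, rng):
    """All cluster stars plus a random half of the field stars form the
    training set; the remaining field stars form the test set."""
    order = rng.permutation(len(field))
    half = len(field) // 2
    train_x = np.vstack([cluster, field[order[:half]]])
    train_t = np.concatenate([np.ones(len(cluster)), np.zeros(half)])
    test_x = field[order[half:]]
    test_t = np.zeros(len(field) - half)
    return train_x, train_t, test_x, test_t

# Synthetic ugriz magnitudes (made up), with the sample sizes quoted in the text
rng = np.random.default_rng(2)
cluster_mags = rng.normal(20.0, 1.0, size=(957, 5))
field_mags = rng.normal(19.0, 1.5, size=(6991, 5))

all_norm, band_mean, band_std = standardize(np.vstack([cluster_mags, field_mags]))
cluster_norm, field_norm = all_norm[:957], all_norm[957:]
train_x, train_t, test_x, test_t = split_train_test(cluster_norm, field_norm, rng)
```

The stored band_mean and band_std would be reused to normalize every source in the full field before it is passed to the trained networks, so that all inputs share the same scale.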

Fig. 5: The color-magnitude diagrams used to select cluster and field stars. a) The CMD of r vs r - i. Bold
dots are the possible cluster members and crosses denote the excluded non-cluster sources. b) The CMD
of g vs g - r, used together with the r vs r - i CMD to pick out cluster stars from the candidates. c) The distribution
of the selected field stars in the r vs r - i CMD. d) The r vs r - i CMD, including both the extracted cluster
members and field stars. The smaller dots are field stars and the larger dots are cluster members. The white
box with four circles shows the enclosure used to decide the number of field star samples.

Ten experiments are carried out for each designed BPNN, and the average output is taken as the cluster
membership probability. In this way, random influences from the initial weights, the direction of weight
updates and the termination conditions of the algorithm are reduced. Here, networks with one hidden layer and
with two hidden layers, using 1, 10, 20, 30 and 40 neurons, are investigated. Their performance is
shown in Fig. 6. Each network is trained and tested ten times with the data set obtained in the previous section.
For every network, we stop the algorithm when the test MSE reaches its minimum. As Fig. 6 indicates,
as the network configuration becomes more complex, the training and test MSEs decrease on the
whole. Within the same network group (for example Layer 1 = 10), the test MSE first decreases
and then increases as the number of neurons in the second hidden layer increases. There is no large difference
among the BPNNs with 2 hidden layers when the number of neurons in the first hidden layer is larger than
10. Considering both the test performance and the complexity of the configuration, which affects the speed of
training, Net[5:10,10,1], which has 5 input elements, 10 neurons in the first hidden layer, 10 in the second
hidden layer and 1 output, serves as our final model.

4.3 Detect Tidal Tails With the Best Trained Network 

Fig. 6: The mean training and test MSEs of various networks including 1 or 2 hidden layers. The darker
solid line shows the training MSEs, while the light dotted line shows the test MSEs. The dashed horizontal
line (MSE = 0.055) marks a reference level that helps to identify the network with the best performance.

It is necessary to check the effectiveness of the trained networks. We therefore construct, as a function of
magnitude (r_psf), the ratio of the mean outputs for field and cluster stars in the training and test data set.
Ideally, the ratio should be close to zero as long as the networks are well trained. In Fig. 7, the ratio rises
gradually as r_psf becomes fainter. No significant deviation above 0.1 is seen near the magnitude limit.
This indicates that the training and test data set and the BPNNs are determined reasonably
well. However, in order to trace the profile of the tidal tails more clearly, we keep only sources
brighter than r_psf = 21, which have smaller photometric errors.

All the sources, with normalized ugriz magnitudes, are fed into the ten trained networks, and the
mean cluster membership probability of each star from the output layer is calculated. In order to present a
panorama of the distribution of cluster stars, we divide the whole field into small square bins of size
6' x 6'. About 90 objects are included in each bin, which is enough for statistical analysis. The
mean probability in each bin is then computed. Gaussian smoothing and median filtering are
used to suppress noise when detecting the tails in the field. In addition, to enhance the resolution
of the distribution, cubic spline interpolation is employed.
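The binning step described above (mean membership probability in square bins) can be sketched with two weighted histograms. The function name and the three toy stars are hypothetical; the Gaussian smoothing, median filtering and spline interpolation applied afterwards in the paper are not reproduced here.

```python
import numpy as np

def mean_probability_map(ra, dec, prob, ra_edges, dec_edges):
    """Mean membership probability and star count in each (ra, dec) bin.

    Bins without stars are set to NaN rather than zero.
    """
    total, _, _ = np.histogram2d(ra, dec, bins=[ra_edges, dec_edges], weights=prob)
    count, _, _ = np.histogram2d(ra, dec, bins=[ra_edges, dec_edges])
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.where(count > 0, total / count, np.nan)
    return mean, count

# Toy example on a 2 x 2 grid of 6' (0.1 deg) bins with made-up probabilities
ra_edges = np.linspace(0.0, 0.2, 3)
dec_edges = np.linspace(0.0, 0.2, 3)
ra = np.array([0.05, 0.06, 0.15])
dec = np.array([0.05, 0.04, 0.15])
prob = np.array([0.2, 0.4, 0.9])
mean_map, count_map = mean_probability_map(ra, dec, prob, ra_edges, dec_edges)
```

Keeping the count map alongside the mean map makes it easy to check that each bin holds enough stars (about 90 in the paper) for the mean to be meaningful.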

Many experiments are carried out to investigate the factors affecting the distribution. These tests
cover the parameters of the smoothing tools, the selection of cluster stars and the choice of field stars from
different regions of the field. We find that cluster stars within the radius R < 0.13° are appropriate for
training the networks. A larger R brings in contamination from field stars, while a smaller R yields fewer cluster
stars, which cannot provide enough information about Palomar 5 members. Thus, the selection criterion for
cluster candidates in §4.1 is reasonable. Field star samples chosen from different regions make no
difference unless they cover the possible position of the tidal tails. Even then, the detection of
the remaining parts is not affected.

Fig. 7: The relation between the ratio of mean outputs for field and cluster stars and the r_psf magnitude.

Fig. 8 shows the contours of the 2-dimensional probability distribution, where the four regions are processed by
the same best trained network and smoothing techniques. Although smoothed, the distribution fluctuates
over the whole field. We find that the P(C) values, the cluster membership probabilities in all bins, follow a
Gaussian distribution with mean value 0.0488 and standard deviation σ = 0.0087. In order to examine the tails
over a wider area and to find possible longer tails, the contour levels start at 1σ above the mean. In this figure,
M5 is detected at the same time, because its color-magnitude diagram is similar to that of Palomar 5. The black solid
line running through R2 and R3 marks a possible tidal tail extending away from the main tails in R0
(226° < α < 236° and -3° < δ < 5°). We are not sure that the tail along this line is a real extension
of the tails of Palomar 5, because the foreground noise in the remaining regions is so similar to it. Fig. 9
shows the contours of the probability distribution with contour levels higher than 1.5σ above
the mean. The possible extension vanishes, but some debris still coexists with the foreground noise.
Grillmair & Dionatos (2006), however, argue that the extension is a real tail. In what follows, we consider only
the region R0, which encloses the clear tidal tails of Palomar 5, as our region of investigation.
The sub-figure in Fig. 9 shows the smoothed probability distribution at 1.5σ in R0. There are some
negligible differences between the two plots. These small changes are caused by different bin coordinates
and by smoothing over a different region.
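The contour thresholds used in Figs. 8 and 9 follow directly from the quoted mean and standard deviation of the binned probabilities. The short sketch below works out the two levels; the helper name and the three sample bin values are hypothetical.

```python
import numpy as np

MEAN_P = 0.0488   # mean of the binned probabilities (from the text)
SIGMA_P = 0.0087  # standard deviation of the binned probabilities (from the text)

def contour_level(n_sigma, mean=MEAN_P, sigma=SIGMA_P):
    """Probability threshold lying n_sigma standard deviations above the mean."""
    return mean + n_sigma * sigma

level_fig8 = contour_level(1.0)   # 0.0488 + 0.0087 = 0.0575
level_fig9 = contour_level(1.5)   # 0.0488 + 0.01305 = 0.06185

# Hypothetical bin values to show which ones survive each threshold
bins = np.array([0.050, 0.058, 0.063])
above_fig8 = bins > level_fig8
above_fig9 = bins > level_fig9
```

Raising the threshold from 1σ to 1.5σ removes bins such as the middle one here, which is how the faint possible extension disappears between the two figures.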

5 PROPERTIES OF THE TIDAL TAILS AND THE CLUSTER 



5.1 The Profiles of the Tails 

The subplot in Fig.|9]presents the holistic smoothed distribution of Palomar 5 members. The whole tail lies 
away from the dense regions of the Galactic extinction, which implies that the extinction has little effect 
on our detection. Two tails extend from the core region of Palomar 5 to southwest and northeast directions, 
which we called South Tail (ST) and North Tail (NT), respectively. ST is the leading part facing toward the 
Galactic disk, while NT is the trailing part. The angular distances are 5.42° for NT and 3.77° for ST. NT is 
a little shorter than the 5.8° northern tail detected bv .Odenkirchen et al.. (2003 ), while the lengths of both 
ST are approximately equal. One reason may be that the smooth i ng flat ten the distribution. In addition, the 
longer debris along the north tail detected bv lOdenkirchen et al.l (1200 3') may be the disturbance or noise of 
field stars. However, the main tails we find are very similar to the results of Od enkirchen et al. (2003), while 
we do not support the discovery of any longer tails to the north of the sky that lGrillmair & DionatosI (l2006h 
reported to be a 22° tail of Palomar 5, because the more extensive debris is so semblable to the background 
noises. There is no photometric data in the south, so we cannot cover larger southern regions to make sure 
a longer ST. 




Fig. 8: The contours of the smoothed cluster membership probability distribution, starting at 1σ above the
mean. The solid line shows the possible extension, and the dashed rectangle R0 is the target region we will
discuss. The overlaid grey background is the reddening distribution of the Galaxy.



Another fact we can see from the contour map in Fig. 9 is that the distribution of cluster members is 
denser in NT than in ST. A possible reason is that the orbit along NT is nearer to us than that along ST: 
given that the components of both tails are the same, and because of their large 
extent, the average magnitude of NT ought to be brighter than that of ST. Then, owing to the detection limit, 
quite a few cluster members in ST either cannot be observed or, although detected, are excluded by the data 
preprocessing. However, when investigating all the detected members in both NT and ST, we do not find any large 
magnitude shift when comparing the mean magnitudes of the two tails at the same distance 
from the cluster center. The maximum magnitude difference (between the northeast and southwest ends) 
is about 0.1 mag (NT is a little brighter than ST). Qualitative analysis indicates that even if the inclination 
angle of the tails is large, the magnitude difference is small for a given star placed at 
different positions in the tails. This tells us that the track of NT may indeed be nearer to us, 
but this cannot account for the distinct density difference between the two tails. Therefore, the difference 
should be attributed to some dynamical processes, which will be studied in the future. 

We convert the probability distribution to a surface density distribution. First, we count the numbers of 
stars in all square bins. Then, the areas of all bins are calculated. In this way, with the smoothed probabilities 
of cluster membership, we get the smoothed surface density distribution. Fig. 10a gives the transformed surface 
density contour map, which is interpolated by cubic spline interpolation. Fig. 10a indicates 
that there are no obvious changes compared with the probability distribution contours. Furthermore, 
M5 and the noise around the tails are removed from the map. From Fig. 10a we can see that there is 
no geometrical symmetry between ST and NT, and that the tails take on an S shape near the center of Palomar 5. 
In fact, Dehnen et al. (2004) have illustrated this structure in their simulations. Fig. 10b shows the radial 
surface density profiles of both ST and NT. In Fig. 10b, the numerical values originate from the smoothed 
density. Thus the density is lower than the density profile along the tails discussed by Odenkirchen et al. 
(2003). Both density profiles drop quickly from the center of Palomar 5. However, the density of NT seems 
to decline less rapidly than that of ST. In addition, the trailing tail seems to lag behind 
the leading tail, because the density of ST reaches its peak at R = 1.7°, while that of NT reaches its minimum there. 
This phenomenon can be faintly seen in Odenkirchen et al. (2003), in the discussion of the radial profiles 




Fig. 9: The contours of the smoothed probability distribution of cluster membership at 1.5σ above 
the mean. The upper-right plot shows the recomputed distribution in the region RO. See the text for 
details. 



of both tails, although it is not mentioned there. We cannot yet explain what causes this phenomenon, but it 
is an interesting result. Possibly, this kind of lag is closely related to the evolution of the cluster and the 
interaction between Palomar 5 and the Galaxy. Moreover, several stellar clumps emerge in both tails and 
form substructures of the cluster. Some of them lie at around R = 0.90° (229.62°, 0.56°), 2.57° (230.94°, 
1.6°), 3.28° (231.56°, 19.6°) and 5.13° (233.28°, 2.76°) in the north, and 1.72° (227.8°, -1.32°) and 3.50° 
(226.5°, -2.54°) in the south. These substructures and the radial density profiles may be formed 
by dynamical processes between Palomar 5 and the Galaxy. One possible explanation is as follows: 
Palomar 5 has experienced several encounters with the Galactic disk or bulge since its birth; its body was 
heated by dynamical shocks; the cluster stars were then accelerated, gradually stripped and spread 
along the direction of motion; as a result, some low-mass stars might have escaped from the tails, and the 
remaining ones constitute the substructures and wait for the next shock. In fact, Odenkirchen et al. (2003) 
have indicated that Palomar 5 will be totally destroyed after the next disk crossing within about 100 Myr. 
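The conversion from binned stars and membership probabilities to a surface density, described at the start of this discussion, can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `surface_density`, the 0.1° (6') bin size, the flat-sky bin area, and the choice of taking the density as the summed membership probability per bin divided by the bin area are ours, and no smoothing or interpolation is applied.

```python
from collections import defaultdict


def surface_density(stars, bin_deg=0.1):
    """Return {(ix, iy): density} in probable members per square degree.

    stars: iterable of (ra_deg, dec_deg, membership_probability).
    Assumption: the density of a bin is the sum of the membership
    probabilities of its stars (star count times mean probability)
    divided by the bin area.
    """
    prob_sum = defaultdict(float)
    for ra, dec, prob in stars:
        # Floor division assigns each star to a square bin, also for dec < 0.
        key = (int(ra // bin_deg), int(dec // bin_deg))
        prob_sum[key] += prob
    area = bin_deg * bin_deg  # flat-sky approximation, adequate near dec = 0
    return {key: total / area for key, total in prob_sum.items()}


# Toy input (made-up positions and probabilities, for illustration only).
toy_stars = [(229.02, 0.01, 0.9), (229.05, 0.03, 0.6), (230.55, 1.25, 0.2)]
toy_density = surface_density(toy_stars)
```

A radial profile along a tail could then be obtained by averaging such bin densities in steps of radial distance from the cluster center.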

5.2 King Model Fitting and the Luminosity Function 

For the density profile near the center, owing to the small size of Palomar 5, we take R < 50' as the range for 
discussion. Cluster stars are counted in all bins (from 2' to 6.8' the bin size is 0.3'; from 8' to 15' the bin 
size is 1'; for the rest the bin size is 5'), and the area of each annulus is calculated. Fig. 11 shows the radial 
surface density profile and the best-fitting King model (King 1962). The King model is expressed as: 

$$\rho = k \left[ \frac{1}{\sqrt{1 + (R/r_c)^2}} - \frac{1}{\sqrt{1 + (r_t/r_c)^2}} \right]^2, \qquad (1)$$

Fig. 10: (a) The surface density contour map of Palomar 5, converted from the smoothed probability 
distribution map. Here, M5 and the noise around the tails are erased. (b) The radial profiles of surface density 
(deg^-2) for both tails. The interval of the radial distance is 0.05°. 



where k is a constant, r_c is the core radius, r_t is the tidal radius and R is the distance from the center 
of the cluster. Thus, the central density is 

$$\rho_0 = k \left[ 1 - \frac{1}{\sqrt{1 + (r_t/r_c)^2}} \right]^2, \qquad (2)$$

where c = r_t/r_c is the concentration ratio. We fit the data after subtracting the background density (about 0.11) 
and obtain k = 114.9, r_c = 1.60', r_t = 14.29'. Therefore, the concentration ratio c is 8.95 and ρ_0 is 90.81 
arcmin^-2. We find that the radii obtained in this paper are smaller than those of Harris (1996): r_c = 3.25' 
and r_t = 16.28'. From Fig. 11, we can see clearly that the King model cannot fit the observed profile in the 
outer regions (R > 8') because of the tidal tails. 
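As a quick numerical check of the fitted values quoted above, the sketch below evaluates the King (1962) profile in its standard form, k[(1 + (R/r_c)^2)^(-1/2) - (1 + (r_t/r_c)^2)^(-1/2)]^2, with k = 114.9, r_c = 1.60' and r_t = 14.29'. The function name and the convention of returning zero density at R >= r_t are illustrative assumptions, not part of the original analysis.

```python
import math


def king_density(r, k, r_c, r_t):
    """King (1962) surface density at radius r (same units as r_c and r_t)."""
    if r >= r_t:
        return 0.0  # assumed convention: no cluster stars beyond the tidal radius
    inner = 1.0 / math.sqrt(1.0 + (r / r_c) ** 2)
    outer = 1.0 / math.sqrt(1.0 + (r_t / r_c) ** 2)
    return k * (inner - outer) ** 2


# Fitted parameters quoted in the text (radii in arcmin).
K, R_C, R_T = 114.9, 1.60, 14.29

central_density = king_density(0.0, K, R_C, R_T)  # should be close to 90.8 arcmin^-2
concentration = R_T / R_C                         # should be close to 8.9
```

With these inputs the central density comes out within rounding of the quoted 90.81 arcmin^-2, which is consistent with the reconstructed form of the profile.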

The luminosity function (LF) of the cluster reflects its mass distribution. We examine whether the 
luminosity functions of the cluster itself and of the tidal tails are the same, in order to verify that the tidal tails 
come from Palomar 5. We examine the luminosity functions in three regions: the north tail (N), the south 
tail (S1 & S2) and Palomar 5 itself (C). Fig. 12a shows the boundaries of these regions. A radius R = 10' is 
adopted for Palomar 5, which is near enough to the center of the cluster to eliminate contamination from 
the tails. The tail regions, N for the north tail and S1 & S2 for the south tail, lie at R > 0.5°, far enough 
from the center. Star counts in these regions are used to derive the LFs of ST and NT. 
We only consider stars with a high probability of cluster membership (> 0.5), which removes most of 
the contamination from foreground field stars. Consequently, Fig. 12b shows 4 LFs, for ST, NT, the total tails 
and the cluster itself, and all of them are rescaled to match the LF of the cluster. In the magnitude range 
19.0 < r_psf < 20.0, the LFs are rescaled by factors of 19.41, 22.43 and 20.35 for NT, ST and the total tails, respectively. 
On the whole, there is little difference among the LFs when r_psf < 20.5. For r_psf > 21, the LF of the 
cluster lies lower than those of the tails. This confirms the so-called 'mass segregation effect' (Koch et al. 
2004): cluster members with large mass accumulate near the center because they lose 
kinetic energy in encounters with other stars, while low-mass stars escape from the cluster 
into its tails. Thus, these luminosity functions show that the stars in the tails come from Palomar 5, 
and that some relevant physical properties have not changed much over its history. 



6 CONCLUSION 



In this paper, we present a new method, the Back Propagation Neural Network, to detect the tidal tails of 
the globular cluster Palomar 5. Although some approaches such as the matched-filter method (Rockosi et al. 2002) 
are widely applied to identifying the tails, we choose BPNN as our model to find the exact and distinct tails 

Fig. 11: The radial surface density profile near the center of Palomar 5. The circles represent the surface 
densities in all bins, obtained by counting the cluster stars. The King model with the mean background number 
density (0.11) is drawn as the solid line. Both axes are logarithmic and the unit of 
density is arcmin^-2. 



of Palomar 5. The photometric magnitudes in the 5 bands (ugriz) of the SDSS DR6 are the only inputs, and 
the probability of cluster membership is the output of the best-trained BPNN. A BPNN resembles a 
black box, and we need not consider its detailed inner structure. The only thing we need to do is to give it a 
set of well-selected cluster and field stars (they need not be completely accurate) as a teacher from which the 
BPNN learns. After training, the BPNN can estimate the probability of cluster membership. 

First of all, we obtain 15,305,060 objects in a 40 x 20 deg^2 field (220° < α < 260° and -5° < 
δ < 15°). Considering the effectiveness of star/galaxy classification in the SDSS and the completeness 
of the observations, we retain 4,082,662 point sources (stars) with 14 < r_psf < 22 after 
reddening correction and after eliminating the contamination from galaxies as far as possible with the help of the 
distribution map of r_psf - r_exp. Next, we make use of the surface density and CMDs to extract cluster stars 
from the candidates that lie in the circle R < 0.13°. The field stars used for training are chosen 
so that the numbers of cluster and field stars around the main-sequence turnoff of Palomar 
5 are equal. In this way, about 960 cluster stars and 6800 field stars are set aside as the training and test data sets, 
which provide their inherent characteristic information to the BPNN. With the training and test data, the best 
parameters and structure of the BPNN are determined. The best-trained BPNN, with 5 nodes in the input 
layer, 10 neurons in the first hidden layer, 10 neurons in the second hidden layer and 1 output neuron, is 
then used to compute the probability of cluster membership for each point source. We divide the field 

Fig. 12: (a) The boundaries used to calculate the LFs. C is the boundary of the cluster itself with R < 10'; N 
is the boundary of the north tail; S1+S2 is the boundary of the south tail. The boundaries of both NT and 
ST are far away from the center, with R > 0.5°. (b) The luminosity functions. All LFs are rescaled 
to match the LF of the cluster. 



into bins of size 6' x 6' to calculate the mean probability distribution. The impact of the selection of field 
stars is also investigated, and its effect is found to be unimportant. 

S-shaped tidal tails are detected, which extend toward the northeast and southwest from the center of the 
cluster: the trailing tail and the leading tail, respectively. The angular lengths are 5.42° for the north tail 
and 3.77° for the south one. There are also some density clumps in both tails, which are substructures of Palomar 
5. We cannot find any longer stretch of the NT if we do not regard the distant extension as part of the 
tails, and we cannot confirm whether the ST extends farther because no photometric data are available 
farther to the southwest. We also find an interesting phenomenon in the radial density profiles: 
the NT seems to lag behind the ST like a propagating wave, which may be caused by the tidal shocks when 
Palomar 5 crossed the Galactic disk or bulge in its history. In addition, we fit the radial density profile near 
the cluster center with the King model and find that the model fits this kind of remote, low-density globular 
cluster very well when the radial distance is less than 8'. At larger radial distances, however, 
the density drops more slowly because of the tidal tails. The tidal radius obtained in this paper 
is 14.29' and the core radius is 1.60'. Luminosity functions of both tails and the cluster are also determined. 
We find that there is little difference among the LFs, which indicates that both tails come from the original 
cluster and that their properties have not changed appreciably during their lives. 

Appendix A: THE MECHANISM OF BPNN 

For clarity and continuity of our work, a BPNN with two hidden layers is illustrated below (Fig. A.1). 
This network contains an input layer (I in the figure), hidden layers (L1 and L2) and an output layer (O). 
The input layer reads training or test patterns (input patterns), which are passed on to be processed by the hidden 
layers, and the output layer yields the corresponding output results. In our paper, the input patterns correspond to 
the 5-band magnitudes of cluster and field stars. The desired output patterns (target patterns), placed at T in Fig. A.1, 
are 1 for cluster stars or 0 for field ones. The output of the BPNN gives the probability of cluster membership for 
each star. 

The network is executed in two phases: a training phase and a test phase. In the training phase, input and target 
patterns are submitted to the network, and two processes, feeding forward and error back propagation, 
are performed. After the network is given an initial state, an input sample (pattern) travels from the 
input layer through the intermediate layers, where it is processed using the information stored in the weights, and 
a corresponding result emerges at the output layer. There, the network output is compared with 
the relevant target pattern and an error is calculated. Using this error term, error back propagation is 
carried out from the output to the input layer to modify the connection weights and biases (in Fig. A.1), and thereby store 
the learned knowledge. Then the remaining patterns are processed in the same way, and the iteration goes 
on until a preset error limit is satisfied. There are two modes of updating weights and biases: one 
is the incremental mode, in which the weights and biases are updated as the errors of the patterns are back-propagated one 
by one, as described above; the other is the batch mode, in which all patterns travel through the network and 
the total error is accumulated, and the weights and biases are then updated once per iteration. We call one pass 
through all patterns an 'epoch'. In the test phase, patterns that have not been seen by the network 
are presented to check the efficiency and accuracy of the network configured in the training phase. 

The detailed description of the network configuration and the concise mathematics of training it are 
presented below. In the left panel of Fig. A.1 the input layer and the target segment, which are linked to the 
external environment, are separated by dashed lines. The nodes in the hidden and output layers are called neurons. The 
right panel of Fig. A.1 gives the detailed structure of one neuron. There are p neurons from the previous layer 
as inputs connecting to the neuron enclosed by a dashed rectangle, where each connection has a weight w. 
In the neuron, an adder Σ sums all the weighted input values and the bias θ and transmits the result v to a 
transfer function f, which produces an output y. The expressions are 



$$v = \sum_{i=1}^{p} w_i x_i - \theta = \mathbf{w}'\mathbf{x} - \theta, \qquad (A.1)$$

$$y = f(v), \qquad (A.2)$$



where $x_i$ is the output of the ith node of the previous layer, $w_i$ is the corresponding weight, 
$\mathbf{x} = [x_1, x_2, \ldots, x_p]'$ and $\mathbf{w} = [w_1, w_2, \ldots, w_p]'$. Put more vividly, when the weighted sum of the 
inputs exceeds the bias θ, the neuron is activated and releases an output signal to the next neurons. Here, θ can be 
absorbed into the weight vector w if we add one more constant input to the node. That is, we 
introduce $x_0 = -1$ and $w_0 = \theta$ and let the network adjust θ just like the weights, so that v takes the form 
$v = \mathbf{w}'\mathbf{x}$, where $\mathbf{x} = [x_0, x_1, \ldots, x_p]'$ and $\mathbf{w} = [w_0, w_1, \ldots, w_p]'$. The transfer function f can take 
various forms, such as: 
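The equivalence between the explicit-bias form and the augmented form of the net input can be checked with a few lines of code. This is a minimal sketch; the function names and the numerical values are illustrative only.

```python
def neuron_input(weights, inputs, theta):
    """Net input v = sum_i w_i * x_i - theta, with an explicit bias theta."""
    return sum(w * x for w, x in zip(weights, inputs)) - theta


def neuron_input_augmented(weights, inputs, theta):
    """Same net input, with theta folded into the weights via x_0 = -1, w_0 = theta."""
    w_aug = [theta] + list(weights)
    x_aug = [-1.0] + list(inputs)
    return sum(w * x for w, x in zip(w_aug, x_aug))


# Arbitrary example values: 0.5*2.0 - 0.25*4.0 + 1.0*0.5 - 0.3 = 0.2
example_w = [0.5, -0.25, 1.0]
example_x = [2.0, 4.0, 0.5]
example_theta = 0.3
v_explicit = neuron_input(example_w, example_x, example_theta)
v_augmented = neuron_input_augmented(example_w, example_x, example_theta)
```

Treating the bias as one more weight means that a training rule only has to update a single vector per neuron.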

Linear: $f = v$, (A.3) 

Log-sigmoid: $f = \frac{1}{1 + e^{-v}}$, (A.4) 

Tan-sigmoid: $f = \frac{2}{1 + e^{-2v}} - 1$, (A.5) 

where -∞ < v < +∞. Among these functions, the log-sigmoid transfer function, which yields results 
in the range from 0 to 1, is commonly used in back-propagation networks, partly because it is infinitely 
differentiable. We adopt this transfer function (Equation A.4) in the present study. 
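For reference, the three transfer functions can be written out as below, assuming the usual definitions (log-sigmoid 1/(1 + e^(-v)), tan-sigmoid 2/(1 + e^(-2v)) - 1, which is identical to tanh). The derivative helper is included because back propagation needs it; all function names are illustrative.

```python
import math


def linear(v):
    """Linear transfer function: output equals the net input."""
    return v


def log_sigmoid(v):
    """Log-sigmoid transfer function, with outputs in the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def tan_sigmoid(v):
    """Tan-sigmoid transfer function, with outputs in (-1, 1); equals tanh(v)."""
    return 2.0 / (1.0 + math.exp(-2.0 * v)) - 1.0


def log_sigmoid_derivative(v):
    """Derivative of the log-sigmoid, expressed through its own output."""
    f = log_sigmoid(v)
    return f * (1.0 - f)
```

Because the log-sigmoid output lies between 0 and 1, it can be read directly as a membership probability, which matches the targets of 1 (cluster) and 0 (field). Note that `math.exp` overflows for very large negative `v` in this simple form.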

Now we return to training the network. We select the batch training mode in this paper. In this 
mode, the weights and biases are updated only after all the inputs and targets have been presented, and as a result 
the gradients (the quantitative changes of the weights and biases) are averaged together to produce more accurate 
estimates. In this case, a performance function known as the mean squared error (MSE) is used to evaluate the outputs 
of the network. The MSE is expressed as 

$$E(\mathbf{w}) = \frac{1}{Nh} \sum_{i=1}^{Nh} (T_i - O_i)^2 = \frac{1}{Nh} (\mathbf{T} - \mathbf{O})'(\mathbf{T} - \mathbf{O}) = \frac{1}{Nh} \|\mathbf{T} - \mathbf{O}\|^2, \qquad (A.6)$$

where N is the number of input-target pairs, h is the dimension of the output vector O, T is the target pattern 
vector, $O_i$ and $T_i$ are the components of O and T, and w contains all the weights and biases in the BPNN. 
The target patterns are 1 or 0, and only one output neuron is needed, so h = 1 in this paper. 
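For a single output neuron (h = 1), the performance function reduces to the average squared difference between targets and outputs over the N patterns. The sketch below assumes that normalization by N; the function name and the example values are illustrative.

```python
def mean_squared_error(targets, outputs):
    """Mean squared error over N input-target pairs for a single output neuron."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("targets and outputs must be non-empty and of equal length")
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


# Made-up example: two cluster stars (target 1) and two field stars (target 0).
example_targets = [1.0, 0.0, 1.0, 0.0]
example_outputs = [0.9, 0.2, 0.6, 0.1]
example_mse = mean_squared_error(example_targets, example_outputs)  # (0.01+0.04+0.16+0.01)/4
```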

Given the performance function, a training algorithm is needed to teach the BPNN 
how to classify. During training, our aim is to make the outputs approach the target patterns as closely as 
possible, thereby decreasing the value of the MSE. That is, in order to adjust the weights for better learning, we 
minimize this performance function epoch by epoch. Consequently, 
a general training scheme to update w can be written as 

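$$\Delta \mathbf{w}^{(k)} = \mathbf{w}^{(k+1)} - \mathbf{w}^{(k)} = \eta^{(k)} \mathbf{d}^{(k)} \qquad (A.7)$$

or 

$$\mathbf{w}^{(k+1)} = \mathbf{w}^{(k)} + \eta^{(k)} \mathbf{d}^{(k)}, \qquad (A.8)$$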

where k denotes the kth epoch of training, the positive η is the learning rate, which determines the step length of 
the changes of w, and d is the search direction along which w moves. All the training algorithms of the back propagation 
network are variations of the above form. For example, the most basic algorithm, Steepest Descent BP 
(SDBP), takes the negative gradient of E(w) as d. Thus, the learning rule becomes 

$$\Delta \mathbf{w}^{(k)} = -\eta^{(k)} \nabla E(\mathbf{w}) \big|_{\mathbf{w} = \mathbf{w}^{(k)}}, \qquad (A.9)$$
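The steepest-descent rule in batch mode can be illustrated on the smallest possible case: a single log-sigmoid neuron with one input and a bias, trained on the mean squared error. This is a sketch under stated assumptions, not the two-hidden-layer network of this paper; the toy data, the fixed learning rate of 0.5 and all function names are ours.

```python
import math


def forward(w, x):
    """Output of one log-sigmoid neuron; x[0] = -1 carries the bias weight w[0]."""
    v = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-v))


def mse(w, patterns, targets):
    """Mean squared error over all patterns (batch performance function)."""
    return sum((t - forward(w, x)) ** 2 for x, t in zip(patterns, targets)) / len(targets)


def gradient(w, patterns, targets):
    """Gradient of the MSE with respect to w, obtained with the chain rule."""
    grad = [0.0] * len(w)
    for x, t in zip(patterns, targets):
        o = forward(w, x)
        delta = 2.0 * (o - t) * o * (1.0 - o)  # dE/dv for this pattern
        for j, xj in enumerate(x):
            grad[j] += delta * xj / len(targets)
    return grad


def sdbp_epoch(w, patterns, targets, eta):
    """One batch update: w <- w - eta * grad E(w)."""
    g = gradient(w, patterns, targets)
    return [wj - eta * gj for wj, gj in zip(w, g)]


# Toy data (made up): one feature per pattern, plus the constant input x0 = -1.
patterns = [[-1.0, 0.0], [-1.0, 0.2], [-1.0, 0.8], [-1.0, 1.0]]
targets = [0.0, 0.0, 1.0, 1.0]

weights = [0.0, 0.0]
initial_error = mse(weights, patterns, targets)
for _ in range(2000):
    weights = sdbp_epoch(weights, patterns, targets, eta=0.5)
final_error = mse(weights, patterns, targets)
```

Each pass of the loop is one epoch: all patterns are presented, the gradient is averaged over them, and the weights are changed once.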

where ∇E(w) is the gradient of E. In this training algorithm, the detailed update formula of w 
in each layer is obtained by the chain rule (Hagan, Demuth & Beale 1996; Haykin 1998, § 11.9), and the 
learning rate can be optimized (Hagan, Demuth & Beale 1996, § 9.6 & 12.12). There are other 
algorithms for training a BPNN: Backpropagation with Momentum (MOBP) (Hagan, Demuth & Beale 
1996, § 12.9), Conjugate Gradient Backpropagation (CGBP) (Hagan, Demuth & Beale 1996, § 
9.15 & 12.15), the Newton method, which uses the Hessian matrix (second derivatives) of the performance 
function to set the direction (Hagan, Demuth & Beale 1996, § 9.10), Levenberg-Marquardt Backpropagation (LMBP) 
(Hagan, Demuth & Beale 1996, § 12.19) and so on. Here, we choose LMBP as our training algorithm 
because it converges fastest. A detailed description of the LMBP algorithm can also be found in Ball et al. 
(2004). 
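To give a feel for how a Levenberg-Marquardt update differs from steepest descent, the sketch below applies the standard LM step, Δw = -(J'J + μI)^(-1) J'e, to a single log-sigmoid neuron with two weights, where e is the vector of errors (target minus output) and J is its Jacobian with respect to the weights. It is not the 5-10-10-1 network used in this paper. The toy data, the starting value μ = 0.01, the factor of 10 for adjusting μ and all function names are assumptions made for illustration.

```python
import math


def log_sigmoid(v):
    """Numerically stable log-sigmoid."""
    if v >= 0.0:
        return 1.0 / (1.0 + math.exp(-v))
    z = math.exp(v)
    return z / (1.0 + z)


def outputs(w, patterns):
    return [log_sigmoid(w[0] * x[0] + w[1] * x[1]) for x in patterns]


def sse(w, patterns, targets):
    """Sum of squared errors, the quantity the LM step tries to reduce."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs(w, patterns)))


def lm_step(w, patterns, targets, mu):
    """One trial update dw = -(J'J + mu*I)^-1 J'e for a two-weight neuron."""
    outs = outputs(w, patterns)
    errors = [t - o for t, o in zip(targets, outs)]
    # Jacobian of the errors: d e_i / d w_j = -o_i * (1 - o_i) * x_ij
    jac = [[-o * (1.0 - o) * xj for xj in x] for o, x in zip(outs, patterns)]
    a = sum(r[0] * r[0] for r in jac) + mu
    b = sum(r[0] * r[1] for r in jac)
    d = sum(r[1] * r[1] for r in jac) + mu
    g0 = sum(r[0] * e for r, e in zip(jac, errors))
    g1 = sum(r[1] * e for r, e in zip(jac, errors))
    det = a * d - b * b  # positive because mu > 0
    dw0 = -(d * g0 - b * g1) / det
    dw1 = -(a * g1 - b * g0) / det
    return [w[0] + dw0, w[1] + dw1]


def train_lm(w, patterns, targets, mu=0.01, epochs=20):
    """Accept a step only if the error drops; otherwise raise mu and retry."""
    for _ in range(epochs):
        current = sse(w, patterns, targets)
        while mu < 1e10:
            trial = lm_step(w, patterns, targets, mu)
            if sse(trial, patterns, targets) < current:
                w, mu = trial, mu / 10.0
                break
            mu *= 10.0
    return w


# Toy, non-separable data (made up); x0 = -1 carries the bias weight.
lm_patterns = [[-1.0, 0.0], [-1.0, 0.2], [-1.0, 0.4], [-1.0, 0.6], [-1.0, 0.8], [-1.0, 1.0]]
lm_targets = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]

lm_start = [0.0, 0.0]
lm_initial_error = sse(lm_start, lm_patterns, lm_targets)
lm_weights = train_lm(lm_start, lm_patterns, lm_targets)
lm_final_error = sse(lm_weights, lm_patterns, lm_targets)
```

A small μ makes the step close to a Gauss-Newton step, while a large μ makes it a short steepest-descent step, which is why the method typically needs far fewer epochs than SDBP on problems of modest size.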